Parallelization of NAS Benchmarks for Shared Memory Multiprocessors
Authors
Abstract
This paper presents our experiences of parallelizing the sequential implementation of NAS benchmarks using compiler directives on the SGI Origin2000 distributed shared memory (DSM) system. Porting existing applications to new high performance parallel and distributed computing platforms is a challenging task. Ideally, a user develops a sequential version of the application, leaving the task of porting the code to parallelization tools and compilers. Due to the simplicity of programming shared-memory multiprocessors, compiler developers have provided various facilities to allow users to exploit parallelism. Native compilers on the SGI Origin2000 support multiprocessing directives that let users exploit loop-level parallelism in their programs. Additionally, supporting tools can accomplish this process automatically. We experimented with these compiler directives and supporting tools by parallelizing the sequential implementation of NAS benchmarks. Results reported in this paper indicate that, with minimal effort, the performance gain is comparable with that of the hand-parallelized, carefully optimized, message-passing implementations of the same benchmarks.
Similar resources
Parallelization of NAS Benchmarks for Shared Memory Multiprocessors
This paper presents our experiences of parallelizing the sequential implementation of NAS benchmarks using compiler directives on SGI Origin2000 distributed shared memory (DSM) system. Porting existing applications to new high performance parallel and distributed computing platforms is a challenging task. Ideally, a user develops a sequential version of the application, leaving the task of port...
Performance Modeling and Measurement of Parallelized Code for Distributed Shared Memory Multiprocessors
This paper presents a model to evaluate the performance and overhead of parallelizing sequential code using compiler directives for multiprocessing on distributed shared memory (DSM) systems. With the increasing popularity of shared address space architectures, it is essential to understand their performance impact on programs that benefit from shared memory multiprocessing. We present a simple mod...
Intra node parallelization of MPI programs with OpenMP
The availability of multiprocessors and high performance networks offers the opportunity to construct CLUMPs (Clusters of Multiprocessors) and use them as parallel computing platforms. The main distinctive feature of the CLUMP architecture over the usual parallel computers is its hybrid memory model (message passing between the nodes and shared memory inside the nodes). Some of the primary issues ...
Characterizing Shared-Memory Applications: A Case Study of the NAS Parallel Benchmarks
The objective of this report is to present our characterization of a shared-memory implementation of the NAS Parallel Benchmarks (NPB). This characterization is needed to support the design decisions of future shared-memory multiprocessors. This report presents two sets of characterization data; the first set is the application characteristics that do not change from one hardware configuration to ...
The OpenMP Implementation of NAS Parallel Benchmarks and Its Performance
As the new ccNUMA architecture became popular in recent years, parallel programming with compiler directives on these machines has evolved to accommodate new needs. In this study, we examine the effectiveness of OpenMP directives for parallelizing the NAS Parallel Benchmarks. Implementation details will be discussed and performance will be compared with the MPI implementation. We have demonstra...